Abstract. We show that Csanky's fast parallel algorithm for computing 
the characteristic polynomial of a matrix can be formalized in the logical 
theory LAP, and can be proved correct in LAP from the principle 
of linear independence. LAP is a natural theory for reasoning about 
linear algebra introduced in [8]. Further, we show that several principles 
of matrix algebra, such as linear independence or the Cayley-Hamilton 
Theorem, can be shown equivalent in the logical theory QLA. Applying 
the separation between complexity classes $AC^0[2] \subsetneq DET(GF(2))$, we 
show that these principles are in fact not provable in QLA. In a nutshell, 
we show that linear independence is "all there is" to elementary linear 
algebra (from a proof complexity point of view), and furthermore, linear 
independence cannot be proved trivially (again, from a proof complexity 
point of view).
1 Introduction 

This paper makes the following claim: our intuition that the principle of linear 
independence is all that there is to elementary linear algebra is justified from 
a proof complexity point of view. This means that from the principle of linear 
independence we can prove other strong principles of linear algebra (for example, 
the Cayley-Hamilton Theorem) using concepts of very low computational com- 
plexity. Furthermore, we claim that linear independence itself cannot be proved 
using concepts of low computational complexity. 

To argue this claim, we present a new feasible proof of the Cayley-Hamilton 
Theorem (CHT) from the principle of linear independence in a weak theory of 
linear algebra (QLAP). The proof is based on Csanky's algorithm for computing 
the characteristic polynomial of a matrix. Csanky's algorithm is a fast parallel 
algorithm that computes the characteristic polynomial of a matrix over fields of 
characteristic zero. 

QLAP is a first order theory for reasoning about matrices. Our new proof 
of the CHT with Csanky's algorithm leads to QLAP proofs of equivalence of 
important principles of linear algebra (for example, linear independence and the 
CHT). We also show that these principles are independent of QLAP. To show 
this independence we use the previously known result that $AC^0[2]$ is properly 
contained in $DET(GF(2))$. 

The class $AC^0[2]$ consists of problems solvable with polynomial size circuits 
(in the size of the input) of bounded depth, where besides the usual gates 
$\{\wedge, \vee, \neg\}$ we are also allowed to use the parity gate $\oplus$. 
The class $DET(GF(2))$ consists of problems $AC^0$ reducible to computing the 
determinant over the field of two elements. Another class which will make a 
frequent appearance in this paper is $NC^2$, which consists of those problems 
which are solvable with polynomial size circuits of depth $O(\log^2 n)$ (where 
$n$ is the size of the input). 

It is known that $AC^0[2] \subsetneq DET(GF(2)) \subseteq NC^2 \subseteq PolyTime$, 
and the separation between the first two complexity classes is the famous result 
of Razborov and Smolensky ([IE]). This separation will be instrumental in showing 
our independence result in the last section. 

In this line of research we are motivated by a dual purpose: we want to 
understand the proof complexity of linear algebra, and we are also searching for 
good candidates for separating the Frege and extended Frege propositional proof 
systems. This separation is a central problem in theoretical computer science, 
and the theorems of universal linear algebra are considered to be good candidates 
to show such a separation — see PQ for more background on this quest. 

In [8] we introduced the logical theories LA ⊆ LAP ⊆ ∃LA, and we gave the 
first feasible (i.e., using polynomial time concepts) proof of the CHT, a central 
theorem of matrix algebra from which many other universal theorems follow (in 
LAP). Our proof was based on Berkowitz's algorithm, which is an efficient paral- 
lel algorithm for computing the characteristic polynomial of a matrix (and hence 
the inverse, adjoint, and determinant of a matrix). Berkowitz's algorithm is field 
independent (that is, it works over any field), and it can be formalized with 
NC 2 circuits. Both Berkowitz's algorithm and Csanky's algorithm are NC 2 al- 
gorithms, and have the following interesting relationship: if they could be shown 
to compute the same thing in LAP, they could both be shown correct in LAP. 
As things stand now, our best proofs of correctness for both are polytime. 

In Section 2 we describe the relevant theories, LA, LAP, QLA, and ∃LA. 
In Section 3 we describe Csanky's and Berkowitz's algorithms, and show that 
they can be formalized in LAP. In Section 4 we show that the CHT follows in 
LAP from the principle of linear independence. This result is obtained using 
Csanky's algorithm, and so it requires fields of characteristic zero. In Section 5 
we show that five main principles of linear algebra can all be shown equivalent 
in QLA, and furthermore, QLA does not prove any of them. 

2 The theories LA, LAP, ∃LA, and QLA 

We define a quantifier-free theory of Linear Algebra (matrix algebra), and call it 
LA. Our theory is strong enough to prove the ring properties of matrices such as 
A(BC) = (AB)C and A + B = B + A but weak enough so that all the theorems 
of LA (over finite fields or the field of rationals) translate into propositional 
tautologies with short Frege proofs. 

Our theory has three sorts of object: indices (i.e., natural numbers), field 
elements, and matrices, where the corresponding variables are denoted i,j, k, . . .; 
a, b,c, . . .; and A, B, C, . . ., respectively. The semantics assumes that objects of 
type field are from a fixed but arbitrary field, and objects of type matrix have 
entries from that field. 

Terms and formulas are built from the function and predicate symbols: 

\[ 0_{index},\ 1_{index},\ +_{index},\ *_{index},\ -_{index},\ div,\ rem,\ 0_{field},\ 1_{field},\ +_{field},\ *_{field},\ -_{field},\ {}^{-1},\ r,\ c,\ e,\ \Sigma,\ \leq_{index},\ =_{index},\ =_{field},\ =_{matrix},\ cond_{index},\ cond_{field} \]

The intended meanings should be clear, except for the following operations on 
a matrix $A$: $r(A)$, $c(A)$ are the numbers of rows and columns in $A$, $e(A,i,j)$ is 
the field element $A_{ij}$, and $\Sigma(A)$ is the sum of the elements in $A$. Also 
$cond(\alpha, t_1, t_2)$ is interpreted as "if $\alpha$ then $t_1$ else $t_2$", where 
$\alpha$ is a formula all of whose atomic subformulas have the form $m \leq n$ or 
$m = n$, where $m, n$ are terms of type index, and $t_1, t_2$ are terms either both of 
type index or both of type field. The subscripts index and field are usually 
omitted, since they are clear from the context. 

In addition to the usual rules for constructing terms we also allow the terms 
$\lambda ij\langle m, n, t\rangle$ of type matrix. Here $i$ and $j$ are variables of type 
index bound by the $\lambda$ operator, intended to range over the rows and columns of 
the matrix. Here also $m, n$ are terms of type index not containing $i, j$ 
(representing the numbers of rows and columns of the matrix) and $t$ is a term of 
type field (representing the matrix element in position $(i, j)$). 

The $\lambda$ terms allow us to construct the sum, product, transpose, etc., of 
matrices. For example, suppose first that $A$ and $B$ are $m \times n$ matrices. Then, 
$A + B$ can be defined as $\lambda ij\langle m, n, e(A,i,j) + e(B,i,j)\rangle$. Now 
suppose that $A$ and $B$ are $m \times p$ and $p \times n$ matrices, respectively. Then: 

\[ A * B := \lambda ij\langle m, n, \Sigma\lambda kl\langle p, 1, e(A,i,k) * e(B,k,j)\rangle\rangle \]

However, even if matrices are of incompatible size, their addition and product 
are well defined, since the "smaller" matrix is implicitly padded with zeros (as 
$e(A,i,j) = 0$ for $i$ or $j$ outside the range). Thus, all terms are well defined. 
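
The following Python sketch is only an illustration of how the $\lambda$ terms and the zero-padding convention behave; it is not part of the theory LA. The class `Matrix` and the helpers `lam`, `sigma`, `add`, `mul`, `from_rows`, and `to_rows` are hypothetical names introduced here, and taking the larger of the two dimensions in `add` is an assumption made for the illustration.

```python
from fractions import Fraction


class Matrix:
    """Hypothetical model of an LA matrix: dimensions plus an entry function."""

    def __init__(self, rows, cols, entry):
        self.rows, self.cols, self._entry = rows, cols, entry

    def e(self, i, j):
        # e(A, i, j): entries are 1-indexed; outside the range they read as 0.
        if 1 <= i <= self.rows and 1 <= j <= self.cols:
            return Fraction(self._entry(i, j))
        return Fraction(0)


def lam(m, n, t):
    """Model of the term lambda ij <m, n, t>: (i, j) entry is t(i, j)."""
    return Matrix(m, n, t)


def sigma(A):
    """Sigma(A): the sum of all entries of A."""
    return sum(
        (A.e(i, j) for i in range(1, A.rows + 1) for j in range(1, A.cols + 1)),
        Fraction(0),
    )


def add(A, B):
    # Assumption: the result takes the larger dimensions; padding fills the rest.
    m, n = max(A.rows, B.rows), max(A.cols, B.cols)
    return lam(m, n, lambda i, j: A.e(i, j) + B.e(i, j))


def mul(A, B):
    # Entry (i, j) is the Sigma of a p x 1 lambda term, as in the definition of A * B.
    p = max(A.cols, B.rows)
    return lam(
        A.rows,
        B.cols,
        lambda i, j: sigma(lam(p, 1, lambda k, l: A.e(i, k) * B.e(k, j))),
    )


def from_rows(rows):
    return lam(len(rows), len(rows[0]), lambda i, j: rows[i - 1][j - 1])


def to_rows(A):
    return [[A.e(i, j) for j in range(1, A.cols + 1)] for i in range(1, A.rows + 1)]
```

For instance, adding a $2 \times 2$ matrix to a $1 \times 1$ matrix in this sketch changes only the top-left entry, because the missing entries of the smaller matrix read as zero.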

Atomic formulas and formulas are built in the usual manner, but in LA and 
LAP we only allow bounded index quantifiers (note that LA, respectively LAP, 
with bounded index quantifiers is conservative over LA, respectively LAP, with- 
out them). 

We use Gentzen's sequent calculus LK (with quantifier rules omitted) for the 
underlying logic. We include 34 non-logical axioms in four groups: Axioms for 
equality, indices, field elements, and matrices (all quantifier-free). These specify 
the basic properties of the function and predicate symbols. By convention 
each instance of an axiom resulting from substituting terms for variables is also 
an axiom, so the axioms are really axiom schemes. All the axioms are given 
in 0. 






We need an extra axiom to ensure that the underlying field is of characteristic 
zero. This can be stated with $\Sigma I_n \neq 0$, where $I_n$ is the $n \times n$ 
identity matrix, which is given by the constructed term 
$\lambda ij\langle n, n, cond(i = j, 1, 0)\rangle$. This requirement is necessary for 
Csanky's algorithm, which works only over fields of characteristic zero, as it 
performs divisions by integers. 

We need just two non-logical rules: an equality rule for terms of type matrix, 
and the induction rule: 

\[ \frac{\Gamma, \alpha(i) \rightarrow \alpha(i+1), \Delta}{\Gamma, \alpha(0) \rightarrow \alpha(n), \Delta} \tag{2} \]

To formalize Newton's and Berkowitz's algorithms we extend the theory LA 
to the theory LAP by adding a new function symbol P, where P(n, A) means 
$A^n$. We also add two new axioms, which give a recursive definition of P; namely, 
P(0, A) = I and P(n + 1, A) = P(n,A) * A. This is enough to formalize the 
coefficients of the characteristic polynomial of a matrix, as computed by either 
algorithm, as terms in the language of LAP. However, it seems that LAP is 
too weak to prove strong properties of the characteristic polynomial (such as the 
CHT or the multiplicativity of the determinant). 

The theory ∃LA is an extension of LA where we allow induction over formulas 
of the form $(\exists X \leq t)\alpha$, where $\alpha$ has no quantifiers, and 
$\exists X \leq t$ is a bounded existential matrix quantifier ($X \leq t$ is just 
shorthand for $r(X) \leq t \wedge c(X) \leq t$). Note that the theory ∃LAP, defined 
analogously, is conservative over ∃LA because matrix powering (P) can be defined 
in ∃LA; so we don't really need to include P (see [10]). 

Finally, QLA is LA with quantification over matrices, but induction re- 
stricted to formulas of LA. 

This concludes a brief tour through the theories LA, LAP, ∃LA, and QLA. 
They are natural theories, in that they include what one would expect to formalize 
matrix algebra. LA is the weakest, and it can be thought of as the theory 
that proves the ring properties of matrices. LAP is LA together with the matrix 
powering function (and defining axioms), and it can formalize Csanky's and 
Berkowitz's algorithms, but it seems too weak to prove strong properties about 
them. ∃LA is LA together with induction over formulas with bounded matrix 
quantifiers (which also allows it to simulate LAP). 



3 Csanky's and Berkowitz's algorithms 

Both Csanky's and Berkowitz's algorithms compute the characteristic polynomial 
of a matrix, which is usually defined as $p_A(x) = \det(xI - A)$, for a given 
matrix $A$. Let $p_A^{CSANKY}$ and $p_A^{BERK}$ denote the coefficients of the 
characteristic polynomial of $A$ given as column vectors, respectively. Let 
$p_A^{CSANKY}(x)$ and $p_A^{BERK}(x)$ denote the actual characteristic polynomials, 
with coefficients computed by the respective algorithms. 



Newton's symmetric polynomials are defined as follows: $s_0 = 1$, and for 
$1 \leq k \leq n$, by: 

\[ s_k = \frac{1}{k}\sum_{i=1}^{k} (-1)^{i-1} s_{k-i}\,\mathrm{tr}(A^i) \tag{3} \]

Then, $p_A^{CSANKY}(x) = s_0 x^n - s_1 x^{n-1} + s_2 x^{n-2} - \cdots \pm s_n x^0$. It is 
shown in the proof of Lemma 1 how Csanky's algorithm computes the $s_i$'s more 
efficiently (in $NC^2$) than in the straightforward way suggested by the 
recurrence (3). 
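
As a concrete check of the recurrence for the $s_k$, the following Python sketch evaluates it sequentially with exact rational arithmetic. It is the straightforward evaluation, not the parallel $NC^2$ procedure discussed in the proof of Lemma 1; the function names `mat_mul`, `trace`, `newton_s`, and `char_poly_csanky` are hypothetical and chosen for this illustration only.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def newton_s(A):
    """Return [s_0, ..., s_n] from the recurrence s_k = (1/k) sum (-1)^(i-1) s_(k-i) tr(A^i)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    traces = [None]  # traces[i] holds tr(A^i) for i >= 1
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(n):
        power = mat_mul(power, A)
        traces.append(trace(power))
    s = [Fraction(1)]
    for k in range(1, n + 1):
        total = sum((-1) ** (i - 1) * s[k - i] * traces[i] for i in range(1, k + 1))
        s.append(total / k)  # division by k is why characteristic zero is needed
    return s


def char_poly_csanky(A):
    """Coefficients of x^n, x^(n-1), ..., x^0 in s_0 x^n - s_1 x^(n-1) + ... ."""
    return [(-1) ** i * s_i for i, s_i in enumerate(newton_s(A))]
```

For the matrix with rows (2, 1) and (1, 3), whose trace and determinant are both 5, this yields the coefficient list of $x^2 - 5x + 5$.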

Lemma 1. $p_A^{CSANKY}$ can be given as a term of LAP. 

Proof. We follow the ideas in ^2 Section 13.4]. We restate in matrix form: 
s = Ts — b where s, T, b are given, respectively, as follows: 



\s n J 



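Then $s = -(I - T)^{-1}b$. Note that $(I - T)$ is an invertible matrix as it is lower 
triangular, with 1s on the main diagonal. The inverse of $(I - T)$ can be computed 
recursively using the following idea: if $C$ is lower-triangular, with no zeros on 
the main diagonal, then 

\[ C = \begin{pmatrix} C_{11} & 0 \\ C_{21} & C_{22} \end{pmatrix} \quad\Longrightarrow\quad C^{-1} = \begin{pmatrix} C_{11}^{-1} & 0 \\ -C_{22}^{-1}C_{21}C_{11}^{-1} & C_{22}^{-1} \end{pmatrix}. \]

There are $O(\log n)$ many steps and the whole procedure can be simulated with 
circuits of depth $O(\log^2 n)$ and size polynomial in $n$. 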
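
A minimal sequential sketch of this recursive inversion, assuming the standard block identity for a lower-triangular matrix split into quadrants (upper-left inverse, lower-right inverse, and lower-left block equal to minus the product of the lower-right inverse, the lower-left block, and the upper-left inverse). The names `mat_mul` and `inv_lower` are hypothetical; a parallel implementation would invert the two diagonal blocks simultaneously, which this sketch does not attempt.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def inv_lower(C):
    """Invert a lower-triangular matrix with nonzero diagonal by halving."""
    n = len(C)
    if n == 1:
        return [[1 / Fraction(C[0][0])]]
    h = n // 2
    C11 = [row[:h] for row in C[:h]]   # upper-left block
    C21 = [row[:h] for row in C[h:]]   # lower-left block
    C22 = [row[h:] for row in C[h:]]   # lower-right block
    I11, I22 = inv_lower(C11), inv_lower(C22)
    # Lower-left block of the inverse: -(C22^-1 * C21 * C11^-1).
    low_left = [[-x for x in row] for row in mat_mul(mat_mul(I22, C21), I11)]
    top = [row + [Fraction(0)] * (n - h) for row in I11]
    bottom = [left + right for left, right in zip(low_left, I22)]
    return top + bottom
```

The recursion depth is logarithmic in $n$ because each call halves the matrix, which is the source of the $O(\log n)$ step count mentioned above.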

This, however, does not give us an LAP-term, and it would be difficult to 
formalize the proof of correctness of this recursive inversion procedure in LAP. 
Thus, instead of this recursive computation, we use the fact that the CHT can 
be proved correct in LAP for triangular matrices (see [3 Section 5.2]). From 
the characteristic polynomial of $(I - T)$ we obtain its inverse, and the inverse 
can be proved correct (i.e., $(I - T)(I - T)^{-1} = (I - T)^{-1}(I - T) = I$) using 
the CHT for triangular matrices, and this can be formalized in LAP. 

Berkowitz's algorithm, just as Csanky's algorithm, allows us to reduce the 
computation of the characteristic polynomial to matrix powering. Its advantage 
is that it works over any field; however, certain properties (such as the fact that 
similar matrices have the same characteristic polynomial) have easy proofs in 
weak theories (LAP) for Csanky's algorithm, but (seem to) require polytime 
theories (∃LA) for Berkowitz's algorithm. 

Berkowitz's algorithm computes the characteristic polynomial of a matrix in 
terms of the characteristic polynomial of its principal minor: 



\[ A = \begin{pmatrix} a_{11} & R \\ S & M \end{pmatrix} \tag{4} \]

where $R$ is a $1 \times (n-1)$ row matrix and $S$ is an $(n-1) \times 1$ column matrix and 
$M$ is $(n-1) \times (n-1)$. Let $p(x)$ and $q(x)$ be the characteristic polynomials of $A$ 
and $M$ respectively. Suppose that the coefficients of $p$ form the column vector 

\[ p = (p_n\ \ p_{n-1}\ \cdots\ p_0)^t \tag{5} \]

where $p_i$ is the coefficient of $x^i$ in $\det(xI - A)$, and similarly for $q$. Then: 

\[ p = C_1 q \tag{6} \]

where $C_1$ is an $(n+1) \times n$ Toeplitz lower triangular matrix (Toeplitz means 
that the values on each diagonal are constant) and where the entries in the 
first column are defined as follows: $c_{i1} = 1$ if $i = 1$, $c_{i1} = -a_{11}$ if $i = 2$, and 
$c_{i1} = -(RM^{i-3}S)$ if $i \geq 3$. Berkowitz's algorithm consists in repeating this for 
$q$, and continuing so that $p$ is expressed as a product of matrices. Thus: 

\[ p_A^{BERK} = C_1 C_2 \cdots C_n \tag{7} \]

where $C_i$ is an $(n+2-i) \times (n+1-i)$ Toeplitz matrix defined as above except 
$A$ is replaced by its $i$-th principal sub-matrix. Note that $C_n = (1\ \ -a_{nn})^t$. 

Since each element of $C_i$ can be explicitly defined in terms of $A$ using matrix 
powering, and since the iterated matrix product can be reduced to matrix powering 
by a standard method, the entire product (7) can be expressed in terms of 
$A$ using matrix powering. Thus the right-hand side of (7) can be expressed as a 
term in LAP. 
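
The product of Toeplitz matrices described above can be sketched sequentially as follows. This is an illustrative evaluation under the assumption that the $i$-th principal sub-matrix is the trailing one obtained by deleting the first $i-1$ rows and columns (consistent with $M$ being the lower-right block); it multiplies the factors from right to left and is not the LAP term itself. The name `berkowitz` is hypothetical.

```python
from fractions import Fraction


def berkowitz(A):
    """Return the coefficients (p_n, ..., p_0) of det(xI - A), leading coefficient first."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    p = [Fraction(1)]  # characteristic polynomial of the empty matrix
    # Work from the smallest trailing principal sub-matrix up to A itself.
    for t in range(n - 1, -1, -1):
        a = A[t][t]                            # top-left entry of the current block
        R = A[t][t + 1:]                       # row to the right of it
        S = [A[r][t] for r in range(t + 1, n)] # column below it
        M = [row[t + 1:] for row in A[t + 1:]] # remaining lower-right block
        m = n - t                              # size of the current sub-matrix
        # First column of the Toeplitz factor: 1, -a, -RS, -RMS, -RM^2 S, ...
        col = [Fraction(1), -a]
        v = S[:]                               # v holds M^j S
        for _ in range(m - 1):
            col.append(-sum(r * x for r, x in zip(R, v)))
            v = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        # Multiply the (m+1) x m lower-triangular Toeplitz matrix by p.
        p = [
            sum(col[r - c] * p[c] for c in range(min(r, m - 1) + 1))
            for r in range(m + 1)
        ]
    return p
```

No division is performed anywhere in this sketch, which reflects the remark that Berkowitz's algorithm works over any field (here the rationals are used only for convenience).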

Since we can define the characteristic polynomial in LAP (as $p^{CSANKY}$ or 
$p^{BERK}$), it follows immediately that we can also define the determinant and the 
adjoint as terms of LAP. 



4 Correctness of Csanky's Algorithm 

The main result of this section, given as Theorem 1, is the following: 

\[ QLAP \vdash \text{Linear Independence} \supset \text{CHT} \tag{8} \]

where CHT (the Cayley-Hamilton Theorem) stands for $p_A(A) = p_A^{CSANKY}(A) = 0$. 
Since ∃LA proves the principle of linear independence (see [10]), we have a 
new proof that ∃LA can prove the CHT. We assume that the characteristic 
polynomial of $A$, $p_A$, is computed with Csanky's algorithm, i.e., in this section 
$p_A = p_A^{CSANKY}$. 



Lemma 2. LAP proves that similar matrices have the same characteristic polynomial; 
that is, if $P$ is any invertible matrix, then $p_A = p_{PAP^{-1}}$. 

Proof. Observe that $\mathrm{tr}(AB) = \sum_i \sum_j a_{ij}b_{ji} = \sum_j \sum_i b_{ji}a_{ij} = \mathrm{tr}(BA)$. 
Since $(PAP^{-1})^i = PA^iP^{-1}$, using the associativity of matrix multiplication we get 
$\mathrm{tr}(PA^iP^{-1}) = \mathrm{tr}(A^iPP^{-1}) = \mathrm{tr}(A^i)$. 
Inspecting (3), we see that a proof by induction on the $s_i$ proves this lemma. 



Lemma 3. LAP proves that if $A$ is a matrix of the form: 

\[ A = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix} \tag{9} \]

where $B$ and $D$ are square matrices (not necessarily of the same size), and the 
upper-right corner is zero, then $p_A(x) = p_B(x) \cdot p_D(x)$. 

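Proof. Let $s_i^A, s_i^B, s_i^D$ be the coefficients of the characteristic polynomials (as 
given by (3)) of $A, B, D$, respectively. We want to show by induction on $i$ that 

\[ s_i^A = \sum_{j+k=i} s_j^B s_k^D, \]

from which the claim of the lemma follows. The Basis Case: $s_0^A = s_0^B = s_0^D = 1$. 
For the Induction Step, by definition and by the induction hypothesis, we have 
that $s_{i+1}^A$ equals 

\[ \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j s_{i-j}^A\,\mathrm{tr}(A^{j+1}) = \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j \Big[\sum_{p+q=i-j} s_p^B s_q^D\Big] \mathrm{tr}(A^{j+1}) \]

and by the form of $A$ (i.e., (9)), $\mathrm{tr}(A^{j+1}) = \mathrm{tr}(B^{j+1}) + \mathrm{tr}(D^{j+1})$, so this equals 

\[ \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j \Big[\sum_{p+q=i-j} s_p^B s_q^D\Big] \big(\mathrm{tr}(B^{j+1}) + \mathrm{tr}(D^{j+1})\big). \]

To see how this formula simplifies, we divide it into two parts: 

\[ \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j \Big[\sum_{p+q=i-j} s_p^B s_q^D\Big] \mathrm{tr}(B^{j+1}) \;+\; \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j \Big[\sum_{p+q=i-j} s_p^B s_q^D\Big] \mathrm{tr}(D^{j+1}) \]

Consider first the left-hand part. When $q = 0$, $p$ ranges over $\{i, i-1, \ldots, 0\}$, and 
$j + 1$ ranges over $\{1, 2, \ldots, i+1\}$, and therefore, by definition (3), the terms with 
$q = 0$ sum to $s_{i+1}^B s_0^D$. Similarly, the terms with $q = 1$ sum to 
$\frac{i}{i+1} s_i^B s_1^D$, and so on, until the terms with $q = i$ sum to 
$\frac{1}{i+1} s_1^B s_i^D$. Hence we have: 

\[ \sum_{j=0}^{i} \frac{i+1-j}{i+1}\, s_{i+1-j}^B s_j^D \;+\; \frac{1}{i+1}\sum_{j=0}^{i} (-1)^j \Big[\sum_{p+q=i-j} s_p^B s_q^D\Big] \mathrm{tr}(D^{j+1}) \]

The same reasoning, but fixing $p$ instead of $q$ in the right-hand part, gives us: 

\[ \sum_{j=0}^{i} \frac{i+1-j}{i+1}\, s_{i+1-j}^B s_j^D \;+\; \sum_{j=0}^{i} \frac{i+1-j}{i+1}\, s_j^B s_{i+1-j}^D \;=\; \sum_{j+k=i+1} s_j^B s_k^D, \]

since each product $s_j^B s_k^D$ with $j + k = i + 1$ receives total weight 
$\frac{j}{i+1} + \frac{k}{i+1} = 1$. This gives us the induction step and the proof of 
the lemma. 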

To show that $p_A(A) = 0$ it is sufficient to show that $p_A(A)e_i = 0$ for all 
vectors $e_i$ in the standard basis $\{e_1, e_2, \ldots, e_n\}$. Let $k$ be the largest 
integer such that 

\[ \{e_i, Ae_i, \ldots, A^{k-1}e_i\} \tag{10} \]

is linearly independent; we know that $k - 1 < n$, by the principle of linear 
independence (this is the first place where we use linear independence). Then, 
(10) is a basis for a subspace $W$ of $F^n$, and $W$ is invariant under $A$, i.e., given 
any $w \in W$, $Aw \in W$. 

Using Gaussian Elimination we write $A^k e_i$ as a linear combination of the 
vectors in (10). Using the coefficients of this linear combination we write a monic 
polynomial 

\[ g(x) = x^k + c_1 x^{k-1} + \cdots + c_k x^0 \tag{11} \]

such that $g(A)e_i = 0$. 

Let $A_W$ be $A$ restricted to the basis (10), that is, $A_W$ is a matrix representing 
the linear transformation $T_A : F^n \to F^n$ induced by $A$, restricted to the 
subspace $W$. The matrix $A_W$ has the following simple form: 

\[ \begin{pmatrix} 0 & 0 & \cdots & 0 & -c_k \\ 1 & 0 & \cdots & 0 & -c_{k-1} \\ 0 & 1 & \cdots & 0 & -c_{k-2} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -c_1 \end{pmatrix} \tag{12} \]

i.e., it is the companion matrix of the polynomial $g(x)$. Since $p_A = p_{A^t}$, we 
may work with the transpose of $A_W$: $A_W^t$ has the property that its principal 
submatrix is also a companion matrix, and that will be used in a proof by 
induction in the next lemma. 

The proof of the next lemma is the crucial technical result of this section. 
The proof is given in the appendix. 

Lemma 4. LAP proves that the polynomial $g(x)$ is the characteristic polynomial 
of $A_W$, in other words, $g(x) = p_{A_W}(x)$. 
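
The statement of the lemma can be spot-checked numerically. The sketch below builds the companion matrix of a monic polynomial with coefficients $c_1, \ldots, c_k$ (1s below the main diagonal, negated coefficients in the last column) and evaluates the Newton recurrence for the $s_i$ on it. This is only a sanity check on one example, not the LAP proof; `companion`, `mat_mul`, and `csanky_coefficients` are hypothetical names.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def companion(c):
    """Companion matrix of x^k + c[0] x^(k-1) + ... + c[k-1]."""
    k = len(c)
    A = [[Fraction(0)] * k for _ in range(k)]
    for i in range(1, k):
        A[i][i - 1] = Fraction(1)          # 1s below the main diagonal
    for i in range(k):
        A[i][k - 1] = -Fraction(c[k - 1 - i])  # last column: -c_k, ..., -c_1
    return A


def csanky_coefficients(A):
    """Coefficients of x^n, ..., x^0 obtained from Newton's recurrence for the s_i."""
    n = len(A)
    traces, power = [None], [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(n):
        power = mat_mul(power, A)
        traces.append(sum(power[i][i] for i in range(n)))
    s = [Fraction(1)]
    for k in range(1, n + 1):
        s.append(sum((-1) ** (i - 1) * s[k - i] * traces[i] for i in range(1, k + 1)) / k)
    return [(-1) ** i * s_i for i, s_i in enumerate(s)]
```

For $g(x) = x^3 - 6x^2 + 11x - 6$ the computed coefficient list is $(1, -6, 11, -6)$, matching $g$.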

It is interesting to note that Lemma 4 can also be proved (feasibly) for 
Berkowitz's algorithm instead, and the proof is in fact much simpler: consider 
again the matrix given by (12). We assume inductively that $p_M^{BERK}$ (the 
characteristic polynomial of the principal submatrix of (12)) is given by 
$(1\ c_1\ c_2\ \ldots\ c_{k-1})^t$. Since $R = (0\ \ldots\ -c_k)$ and $S = e_1$, 
$p_A^{BERK} = B \cdot p_M^{BERK}$, where $B$ (the matrix given by Berkowitz's algorithm) 
is an $(n+1) \times n$ matrix with 1s on the main diagonal, 0s everywhere else, except 
for $+c_k$ in position $(n+1, 1)$. From this, it is easy to see that $p_A^{BERK}$ is given 
by $(1\ c_1\ c_2\ \ldots\ c_k)^t$. 

As was pointed out in the introduction, if we managed to prove in LAP that 
Csanky's and Berkowitz's algorithms compute the same thing (i.e., $p^{CSANKY} = 
p^{BERK}$) we would have an LAP proof of the CHT for both. The reason is that 
the CHT follows for Berkowitz's algorithm from $\det(A) = \det(PAP^{-1})$, which 
is trivial to prove for Csanky's algorithm (see the proof of Lemma 2). 

Lemma 5. ∃LA proves that the polynomial $g(x)$ divides $p_A(x)$. 

Proof. Extend (10) to a full basis of $F^n$: 

\[ B = \{e_i, Ae_i, \ldots, A^{k-1}e_i, e_{j_1}, \ldots, e_{j_{n-k}}\} \]

This extension can be carried out easily with Gaussian Elimination, by checking 
which vectors from the standard basis ($\{e_1, e_2, \ldots, e_n\}$) are in the span 
consisting of (10) and those vectors that have already been added, and adding only 
those that are not. This is the only other place (besides the paragraph following 
the proof of Lemma 3) where we need to use the principle of linear independence. 
Let $P$ be the change of basis for $A$ from the standard basis to $B$. Then, 

\[ PAP^{-1} = \begin{pmatrix} A_W & 0 \\ * & E \end{pmatrix} \]

where $A_W$ is a $k \times k$ block, $E$ is an $(n-k) \times (n-k)$ block (corresponding 
to the extension), $*$ stands for an unspecified block, and we have a block of zeros 
above $E$ since $W$ is invariant under $A$. By Lemma 2 and Lemma 3 it follows that 
$p_A(x) = p_{PAP^{-1}}(x) = p_{A_W}(x) \cdot p_E(x)$. By Lemma 4, $p_{A_W} = g(x)$, and so 
$g(x)$ divides $p_A(x)$. 

Theorem 1. QLAP proves the Cayley-Hamilton Theorem (CHT) from the 
principle of linear independence, when the characteristic polynomial is computed 
by Csanky's algorithm. 

Proof. By Lemma 5, 

\[ p_A(A)e_i = (p_{A_W}(A) \cdot p_E(A))e_i = (g(A) \cdot p_E(A))e_i = p_E(A) \cdot (g(A)e_i) = 0. \]

Since this is true for any $e_i$ in the standard basis, it follows that $p_A(A) = 0$. 

The proof of the multiplicativity of the determinant is an ∃LA corollary of 
this theorem, as can be seen in [8]. Together, the CHT and the multiplicativity 
of the determinant are two powerful universal principles of linear algebra from 
which many others follow directly. An important open question remains: are 
they provable in LAP? 







5 Equivalence of Matrix Principles 



Consider the following five central principles of linear algebra: 

1. The Cayley-Hamilton Theorem 

2. $(\exists B \neq 0)[AB = I \vee AB = 0]$ 

3. Linear Independence ($n + 1$ vectors in $F^n$ must be linearly dependent) 

4. Weak Linear Independence ($n^k$ vectors ($n, k > 1$) in $F^n$ must be linearly 
dependent) 

5. Every matrix has an annihilating polynomial 

In this section we are going to show that QLA proves their equivalence. Fur- 
thermore, we show that these principles are independent of QLA. Thus, even 
though QLA is strong enough to show them equivalent, it is too weak to prove 
any of them. 

Notice however that QLA does not have the matrix powering function, yet 
two of these principles, namely 1 and 5, require matrix powering to be stated. 
Let POW(A, n) be the formula: 

\[ \exists\langle X_0 X_1 \ldots X_n\rangle(\forall i \leq n)[X_0 = I \wedge (i < n \supset X_{i+1} = X_i * A)] \tag{13} \]

The size of $\langle X_0 X_1 \ldots X_n\rangle$ can be bounded as it is an 
$r(A) \times (r(A) \cdot (n+1))$ matrix. (The abuse of notation in (13) is for better 
readability, but this formula can be stated formally as a bounded $\Sigma_1$ formula 
of QLA.) 

Theorem 2. The five principles of linear algebra can be proved equivalent in 
QLAP with POW(A, n). 

Proof. 3 implies 1 because of the results of the previous section. Note that here 
we need fields of characteristic zero (because of Csanky's algorithm). It is an 
open question whether we can prove this over arbitrary fields — for example in 
the context of Berkowitz's algorithm. 

1 implies 2 because B is just the adjoint, for which we have the desired 
properties from the Cayley-Hamilton Theorem. 

2 implies 3, because suppose that we have $(n + 1)$ vectors in $F^n$, and that 
they are linearly independent. Let $A$ be the $n \times (n + 1)$ matrix whose columns 
are these $n + 1$ vectors. Let $A'$ be the matrix resulting from appending a row of 
zeros to $A$. Since the vectors are linearly independent, there is no $B \neq 0$ such 
that $A'B = 0$, so by 2 there must be a $B$ such that $A'B = I$; but that is not 
possible, given that the last row of $A'$ is zero. 

3 obviously implies 4. 

4 implies 5 because we can look at $\{I, A, A^2, \ldots, A^n\}$, where $A$ is $n \times n$, and 
$k$ as large as we want, and as vectors these matrices are linearly dependent by 4. 

5 implies 2, because if $p(A) = 0$, we can choose the largest $s$ such that 
$p(A) = q(A)A^s$. If $q(A) \neq 0$, we choose the largest $k < s$ so that $q(A)A^k \neq 0$, 
and this is our zero divisor for $A$. If $q(A) = 0$, then it has a non-zero constant 
coefficient, and hence we can obtain from $q(A)$ the inverse for $A$. 



Recall that the Steinitz Exchange Theorem (SET) says the following: 
if $T$ is a (finite) total set for a vector space $V$, i.e., $span(T) = V$, and $E$ is 
a linearly independent set, then there exists an $F \subseteq T$, such that $|F| = |E|$, 
and $(T - F) \cup E$ is total. (Note that in general, SET is stated for any $T$, not 
necessarily finite, but here we assume that $T$ is finite.) 

We can state SET in the language of QLA as follows: associate the finite 
set $T$ of $m$ vectors in $F^n$ with an $n \times m$ matrix $T$, and we can state that $T$ is 
total with $(\exists A)[TA = I]$. Let $E$ be an $n \times k$ matrix representing the $k$ 
vectors in $E$. We want to find $k$ columns in $T$, and replace them by $E$. We can state 
that there exists a permutation matrix $P$ so that $TP$ has those $k$ columns as the 
last $k$ columns. Using the $\lambda$-constructor, we can "chop off" those last $k$ 
columns, and replace them by $E$, and then state that the result is also total. 
Thus, SET can be stated in QLA. 

Lemma 6. QLA proves that the Steinitz Exchange Theorem implies the five 
principles listed at the beginning of this section. 

Proof. We show that SET implies (in QLA) the existence of an annihilating 
polynomial. Consider the set $E = \{I, A, A^2, A^3, \ldots, A^{n^2-1}\}$ where $A$ is an 
$n \times n$ matrix. If $E$ is linearly dependent, we are done: we have an annihilating 
polynomial. Otherwise, suppose that $E$ is linearly independent. 

Let $V = M_{n \times n}(F)$, that is, $V$ is the vector space of $n \times n$ matrices over some 
field $F$ (note that our argument is field independent). Let 
$T = \{e_{ij}\}_{1 \leq i,j \leq n}$, that is, $T$ is the set of all elementary matrices 
$e_{ij}$, which are matrices with 1 in position $(i, j)$ but zeros everywhere else. Note 
that $|T| = |E| = n^2$, and $T$ is clearly total. 

Therefore, by the Steinitz Exchange Theorem, $(T - F) \cup E$ is total for some 
$|F| = |E|$, and so $E$ is total since $T = F$ if $|T| = |E| = n^2$. If $E$ is total, then 
$A^{n^2} \in span(E)$, and hence $E \cup \{A^{n^2}\}$ is linearly dependent, and so we have 
an annihilating polynomial once again. 

Can we show that the five principles, listed at the beginning of this section, 
prove (in QLA) the SET? Here is an obvious proof of SET: pick $E_1$ in $E$, and 
since $T$ is total, we can write it as a linear combination of elements in $T$, say 
$E_1 = a_1T_1 + a_2T_2 + \cdots + a_nT_n$, all $a_j \neq 0$. So, $T_1$ can be written as a 
sum of elements in $(T - \{T_1\}) \cup \{E_1\}$. So, put $T_1$ in $F$. Note that 
$(T - \{T_1\}) \cup \{E_1\}$ remains total. Now pick $E_2$, and write it as a linear 
combination of a finite subset of elements in $(T - \{T_1\}) \cup \{E_1\}$. By the 
assumed linear independence of $E$, $E_2$ cannot be written in terms of $E_1$ alone, so 
like before, we can pick some $T_2$ and put it in $F$. We proceed inductively, at each 
step putting some $T_j$ in $F$. 

The problem with the proof outlined above is that it requires induction over 
formulas with matrix quantifiers, which we do not have in QLA (on the other 
hand, this proof could be easily formalized in ∃LA). Thus we propose the following 
open problem: can SET be proved in QLA from the five principles? More 
generally: can Gaussian Elimination, properly stated, be shown correct in QLA 
from the five principles? 



We conjecture that the answer is "yes" to those two questions, and that they 
are not too hard to prove. 

Lemma 7. $QLA \vdash (\exists B \neq 0)[AB = I \vee AB = 0] \supset POW(A, n)$. 

Proof. We use the reduction of matrix powering to matrix inverse described in [3]. 
Let $N$ be the $n^2 \times n^2$ matrix consisting of $n \times n$ blocks which are all zero 
except for $(n - 1)$ copies of $A$ above the diagonal zero blocks. Then $N^n = 0$, and 

\[ (I - N)^{-1} = I + N + N^2 + \cdots + N^{n-1} = \begin{pmatrix} I & A & A^2 & \cdots & A^{n-1} \\ 0 & I & A & \cdots & A^{n-2} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & I \end{pmatrix} \]

Set $C = I - N$. Show that if $CB = 0$, then $B = 0$, using induction on the rows of 
$B$, starting with the bottom row. Using $(\exists B \neq 0)[CB = I \vee CB = 0]$, conclude 
that there is a $B$ such that $CB = I$. Next, show that 
$B = I + N + N^2 + \cdots + N^{n-1}$, again by induction on the rows of $B$, starting 
with the bottom row. Thus, $B$ contains $I, A, A^2, \ldots, A^{n-1}$ in its top rows, and 
POW(A, n) follows. 
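
The reduction can be checked on a small example. The sketch below builds the block matrix described in the proof (copies of the input matrix in the blocks just above the diagonal), forms the finite sum of its powers directly, confirms that this sum is a right inverse of the identity minus the block matrix, and then reads the powers of the input matrix off the top block row. Forming the sum by repeated multiplication is an assumption of this illustration and not how QLA obtains the inverse; `build_N` and `powers_via_inverse` are hypothetical names.

```python
def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def identity(m):
    return [[int(i == j) for j in range(m)] for i in range(m)]


def build_N(A):
    """n^2 x n^2 matrix of n x n blocks, with A in the blocks just above the diagonal."""
    n = len(A)
    N = [[0] * (n * n) for _ in range(n * n)]
    for b in range(n - 1):
        for i in range(n):
            for j in range(n):
                N[b * n + i][(b + 1) * n + j] = A[i][j]
    return N


def powers_via_inverse(A):
    """Return [I, A, A^2, ..., A^(n-1)] read from the top block row of (I - N)^-1."""
    n = len(A)
    size = n * n
    N = build_N(A)
    B = identity(size)     # running sum I + N + ... + N^j
    term = identity(size)  # current power N^j
    for _ in range(n - 1):
        term = mat_mul(term, N)
        B = [[B[i][j] + term[i][j] for j in range(size)] for i in range(size)]
    C = [[int(i == j) - N[i][j] for j in range(size)] for i in range(size)]
    if mat_mul(C, B) != identity(size):
        raise ValueError("the finite sum is not an inverse of I - N")
    return [[row[j * n:(j + 1) * n] for row in B[:n]] for j in range(n)]
```

The check that the product equals the identity relies on the block matrix being nilpotent, so that the geometric series terminates after finitely many terms.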

Thus, not every implication in Theorem 2 requires POW(A, n). In particular, 
$2 \Leftrightarrow 3$ and $3 \Rightarrow 4$ can be shown in QLA (for $2 \Leftrightarrow 3$ see the proof of the corollary below). 
It is an open question whether 4 implies 3 in QLA. 

Lemma 8. $QLA \nvdash POW(A, n)$. 

Proof. We can turn QLA into a three-sorted universal theory in the style of 
QPV (Hj), by introducing function symbols for all the $\lambda$-terms, so we have 
number-valued functions, field-valued functions, and matrix-valued functions. 
Further, if the underlying field is GF(2), then all these functions are in the 
complexity class AC [2] (by translations given in Hence, by the Herbrand 
Theorem, every existential theorem of QLA can be witnessed by an AC [2] 
function. 

Let DET(GF(2)) be the complexity class of functions $NC^1$ reducible to the 
determinant over GF(2). This class is equal to the class POW(GF(2)), by results 
in PJ. On the other hand, $AC^0[2]$ is properly contained in DET(GF(2)), since 
$L \subseteq DET(GF(2))$ (see g]), while $MAJORITY \in L$ but it is not in $AC^0[2]$ 
(see |5I5]1. 

Corollary 1. QLA does not prove the principles 2 and 3 (while it can show 
them equivalent without POW(A, n)). 

Proof. By Lemmas 7 and 8 we see that QLA does not prove 2. Now, 3 implies 
2 by the following argument: take $A$ and add $e_i$ (the elementary column vector 
with 1 in the $i$-th entry, and zeros everywhere else) as the last column. By linear 
independence, we know that there exist $b_{1i}, b_{2i}, \ldots, b_{(n+1)i}$, not all zero, such 
that $b_{1i}A_1 + b_{2i}A_2 + \cdots + b_{ni}A_n + b_{(n+1)i}e_i = 0$, where $A_j$ is the $j$-th 
column of $A$. If for all $i$, $b_{(n+1)i}$ is not zero, we found $B$ such that $AB = I$. If, 
on the other hand, some $b_{(n+1)i} = 0$, then $B$ consisting of columns given by 
$[b_{1i}\ b_{2i}\ \cdots\ b_{ni}]$ is a zero divisor of $A$, i.e., $AB = 0$. 

6 Conclusions and Open Problems 



We gave a new feasible proof of the Cayley-Hamilton Theorem via Csanky's 
algorithm. The new proof requires fields of characteristic zero, but it shows that 
the CHT follows in LAP from the principle of linear independence. It is an 
open question whether the CHT follows in LAP from the principle of linear 
independence over general fields. 

We showed that five important principles of linear algebra can be shown 
equivalent in QLA, and using a previously known separation of complexity 
classes (namely $AC^0[2] \subsetneq DET(GF(2))$) we showed that none of these 
principles is provable in QLA. 

It is an interesting open problem whether the principles listed in Theorem 2 
can be proved in QLA + POW(A, n). Likewise, it is an open problem whether 
Berkowitz's and Csanky's algorithms are provably correct in LAP (they can be 
stated in LAP, and weak properties of correctness are provable in LAP). 

Acknowledgments: The author would like to thank Stephen Cook for pointing 
out the proof of the Cayley-Hamilton Theorem in which is the basis for the 
proof in section The material in section [3] came from discussions with Mark 
Braverman and Stephen Cook. Finally, the author is grateful to the anonymous 
referees, especially to the referee who succinctly and elegantly expressed the 
contribution of this paper (see the first sentence of the introduction) . 

7 Appendix 

Proof (of Lemma 4). We will drop the $W$ from $A_W$ as there is no danger of confusion 
(the original matrix $A$ does not appear in the proof); thus, $A$ is a $k \times k$ matrix, 
with 1s below the main diagonal, and zeros everywhere else except (possibly) in 
the last column where it has the negations of the coefficients of $g(x)$. 

As was noted above, $A$ is divided into four quadrants, with the upper-left 
containing just 0. Let $R = (0\ \ldots\ -c_k)$ be the row vector in the upper-right 
quadrant. Let $S = e_1$ be the column vector in the lower-left quadrant, i.e., the 
first column of $A$ without the top entry. Finally, let $M$ be the principal submatrix 
of $A$, $M = A[1|1]$, the lower-right quadrant. 

Let $s_0, s_1, \ldots, s_k$ be the Newton's symmetric polynomials of $A$. 

To prove that $g(x) = p_{A_W}(x)$ we prove something stronger: we show that 
(i) for all $0 \leq i \leq k$, $(-1)^i s_i = c_i$, and (ii) $p_A(A) = 0$. 

We show this by induction on the size of the matrix $A$. Since the principal 
submatrix of $A$ (i.e., $M$) is also a companion matrix, we assume that for $i < k$, 
the coefficients of the symmetric polynomial of $M$ are equal to the $c_i$'s, and 
that $p_M(M) = 0$. (Note that the Basis Case of the induction is a $1 \times 1$ 
matrix, and it is trivial to prove.) 

Since for $i < k$, $\mathrm{tr}(A^i) = \mathrm{tr}(M^i)$, it follows from the definition of the 
$s_i$ and the induction hypothesis that for $i < k$, $(-1)^i s_i = c_i$ (note that 
$s_0 = c_0 = 1$). 

Next we show that $(-1)^k s_k = c_k$. By definition we have that $s_k$ is equal to: 

$$\frac{1}{k}\left(s_{k-1}\mathrm{tr}(A) - s_{k-2}\mathrm{tr}(A^2) + \cdots + (-1)^{k-2}s_1\mathrm{tr}(A^{k-1}) + (-1)^{k-1}s_0\mathrm{tr}(A^k)\right)$$

and by the induction hypothesis and the fact that for $i < k$, 
$\mathrm{tr}(A^i) = \mathrm{tr}(M^i)$, we have: 

$$= \frac{1}{k}(-1)^{k-1}\left(c_{k-1}\mathrm{tr}(M) + c_{k-2}\mathrm{tr}(M^2) + \cdots + c_1\mathrm{tr}(M^{k-1}) + c_0\mathrm{tr}(A^k)\right).$$

Note that $\mathrm{tr}(A^k) = -kc_k + \mathrm{tr}(M^k)$, so: 

$$= \frac{1}{k}(-1)^{k-1}\left[c_{k-1}\mathrm{tr}(M) + c_{k-2}\mathrm{tr}(M^2) + \cdots + c_1\mathrm{tr}(M^{k-1}) + c_0\mathrm{tr}(M^k)\right] + (-1)^k c_k.$$

Observe that 

$$\mathrm{tr}\left(c_{k-1}M + c_{k-2}M^2 + \cdots + c_1M^{k-1} + c_0M^k\right) = \mathrm{tr}(p_M(M)M) = \mathrm{tr}(0) = 0$$

since $p_M(M) = 0$ by the induction hypothesis. Therefore, $s_k = (-1)^k c_k$. 
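
The coefficient claim (i) can be checked numerically on a small instance. The
following is a minimal sketch, not part of the LAP formalization: it builds a
companion matrix in the shape described at the start of the proof, computes
Newton's symmetric polynomials from the trace recurrence used above, and
compares $(-1)^i s_i$ with $c_i$. The helper names (`companion`,
`newton_symmetric`) and the sample coefficients are assumptions made for
illustration only.

```python
from fractions import Fraction


def companion(c):
    """Companion matrix of g(x) = x^k + c[1]*x^(k-1) + ... + c[k], with c[0] == 1.

    Ones sit below the main diagonal; the last column holds
    -c[k], ..., -c[1] from top to bottom.
    """
    k = len(c) - 1
    A = [[Fraction(0)] * k for _ in range(k)]
    for r in range(1, k):
        A[r][r - 1] = Fraction(1)
    for r in range(k):
        A[r][k - 1] = Fraction(-c[k - r])
    return A


def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def newton_symmetric(A):
    """Return [s_0, ..., s_k] with s_0 = 1 and
    m * s_m = sum_{i=1..m} (-1)^(i-1) * s_{m-i} * tr(A^i)."""
    k = len(A)
    traces = [None]              # traces[i] will hold tr(A^i)
    power = A
    for _ in range(k):
        traces.append(trace(power))
        power = matmul(power, A)
    s = [Fraction(1)]
    for m in range(1, k + 1):
        total = sum((-1) ** (i - 1) * s[m - i] * traces[i]
                    for i in range(1, m + 1))
        s.append(total / m)      # division by m needs characteristic zero
    return s


c = [1, 2, -3, 5]                # hypothetical g(x) = x^3 + 2x^2 - 3x + 5
s = newton_symmetric(companion(c))
print([(-1) ** i * s[i] for i in range(len(c))])   # should reproduce c
```

Exact rational arithmetic (`Fraction`) is used because the recurrence divides
by $m$, which mirrors the restriction of Csanky's algorithm to fields of
characteristic zero.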

It remains to prove that $p_A(A) = \sum_{i=0}^{k} c_i A^{k-i} = 0$. First, show that 
for $1 \le i \le (k-1)$: 

$$A^{i+1} = \begin{pmatrix} 0 & RM^i \\ M^iS & \sum_{j=0}^{i-1} M^jSRM^{(i-1)-j} + M^{i+1} \end{pmatrix} \tag{14}$$



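(For $A$ of the form given by (12), and $R, S, M$ defined as in the first 
paragraph of the proof.) Define $w_i, X_i, Y_i, Z_i$ as follows: 

$$\begin{pmatrix} w_{i+1} & X_{i+1} \\ Y_{i+1} & Z_{i+1} \end{pmatrix} = \begin{pmatrix} w_i & X_i \\ Y_i & Z_i \end{pmatrix} \begin{pmatrix} 0 & R \\ S & M \end{pmatrix} = \begin{pmatrix} X_iS & w_iR + X_iM \\ Z_iS & Y_iR + Z_iM \end{pmatrix} \tag{15}$$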



We want to show that the right-most matrix of (15) is equal to the right-hand 
side of (14). First note that: 

$$X_{i+1} = \sum_{j=0}^{i} w_{i-j}RM^j \tag{16}$$

With the convention that $w_0 = 1$. See [8, lemma 5.1] for an LAP-proof of (16). 
Since $w_1 = 0$, a straightforward induction shows that $w_{i+1} = 0$. Therefore, at 
this point the right-most matrix of (15) can be simplified to: 

$$\begin{pmatrix} 0 & RM^i \\ Z_iS & Y_iR + Z_iM \end{pmatrix}$$

Again by [8, lemma 5.1] we have: 

$$Y_{i+1} = M^iS + \sum_{j=0}^{i-2} (RM^jS)\,Y_{i-1-j} \qquad Z_{i+1} = M^{i+1} + \sum_{j=0}^{i-1} Y_{i-j}RM^j$$

By the same reasoning as above, $\sum_{j=0}^{i-2} (RM^jS)\,Y_{i-1-j} = 0$, so putting it 
all together we obtain the right-hand side of (14). 
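
The block form of the powers of $A$ can be sanity-checked on a concrete
companion matrix. The sketch below is an illustration under stated
assumptions, not the LAP argument itself: the function names and the sample
coefficients are hypothetical, and plain integer arithmetic stands in for the
underlying field. It compares the upper-right, lower-left and lower-right
blocks of $A^{i+1}$ with the closed forms for every $1 \le i \le k-1$; the
upper-left entry is compared with 0 only for powers below $k$, since the
$k$-th power carries $-c_k$ in that position (in line with
$\mathrm{tr}(A^k) = -kc_k + \mathrm{tr}(M^k)$).

```python
def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matpow(X, n):
    P = identity(len(X))
    for _ in range(n):
        P = matmul(P, X)
    return P


def companion(c):
    """Ones below the diagonal, last column -c[k], ..., -c[1]; c[0] == 1."""
    k = len(c) - 1
    A = [[0] * k for _ in range(k)]
    for r in range(1, k):
        A[r][r - 1] = 1
    for r in range(k):
        A[r][k - 1] = -c[k - r]
    return A


def split_blocks(P):
    """Return (upper-left entry, upper-right row, lower-left column, lower-right block)."""
    return (P[0][0], [P[0][1:]],
            [[row[0]] for row in P[1:]], [row[1:] for row in P[1:]])


def check_block_formula(c):
    """Compare the blocks of A^(i+1) with the closed forms, for 1 <= i <= k-1 (k >= 2)."""
    k = len(c) - 1
    A = companion(c)
    _, R, S, M = split_blocks(A)
    SR = matmul(S, R)
    for i in range(1, k):
        w, X, Y, Z = split_blocks(matpow(A, i + 1))
        expected = matpow(M, i + 1)
        for j in range(i):       # j = 0, ..., i-1
            term = matmul(matmul(matpow(M, j), SR), matpow(M, i - 1 - j))
            expected = matadd(expected, term)
        if X != matmul(R, matpow(M, i)):
            return False
        if Y != matmul(matpow(M, i), S):
            return False
        if Z != expected:
            return False
        if i + 1 < k and w != 0:  # upper-left entry checked below the k-th power only
            return False
    return True


print(check_block_formula([1, 2, -3, 5]))
```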

Using the induction hypothesis ($p_M(M) = 0$) it is easy to show that the first 
row and column of $p_A(A)$ are zero. Also, by the induction hypothesis, the term 
$M^{i+1}$ in the principal submatrix of $p_A(A)$ disappears but leaves $c_kI$. 
Therefore, it will follow that $p_A(A) = 0$ if we show that 

$$\sum_{i=2}^{k} c_{k-i} \sum_{j=0}^{i-2} M^jSRM^{(i-2)-j} \tag{17}$$

is equal to $-c_kI$. 
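
Before working through the row-by-row argument, the target identity can be
tested on a small example. This is a minimal sketch under the same
assumptions as the proof (a companion matrix with $c_0 = 1$); the function
names `expression_17` and `char_poly_at` and the sample coefficients are
hypothetical, chosen only for illustration. It evaluates the double sum (17)
directly and, separately, evaluates $\sum_{i=0}^{k} c_i A^{k-i}$.

```python
def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(t, X):
    return [[t * a for a in row] for row in X]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matpow(X, n):
    P = identity(len(X))
    for _ in range(n):
        P = matmul(P, X)
    return P


def companion(c):
    """Ones below the diagonal, last column -c[k], ..., -c[1]; c[0] == 1."""
    k = len(c) - 1
    A = [[0] * k for _ in range(k)]
    for r in range(1, k):
        A[r][r - 1] = 1
    for r in range(k):
        A[r][k - 1] = -c[k - r]
    return A


def expression_17(c):
    """sum_{i=2..k} c[k-i] * sum_{j=0..i-2} M^j S R M^((i-2)-j), for k >= 2."""
    k = len(c) - 1
    A = companion(c)
    R = [A[0][1:]]                       # upper-right row vector
    S = [[row[0]] for row in A[1:]]      # lower-left column vector
    M = [row[1:] for row in A[1:]]       # lower-right principal submatrix
    SR = matmul(S, R)
    total = scale(0, identity(k - 1))
    for i in range(2, k + 1):
        for j in range(i - 1):           # j = 0, ..., i-2
            term = matmul(matmul(matpow(M, j), SR), matpow(M, i - 2 - j))
            total = matadd(total, scale(c[k - i], term))
    return total


def char_poly_at(c):
    """sum_{i=0..k} c[i] * A^(k-i) for the companion matrix A of c."""
    k = len(c) - 1
    A = companion(c)
    total = scale(0, identity(k))
    for i in range(k + 1):
        total = matadd(total, scale(c[i], matpow(A, k - i)))
    return total


c = [1, 2, -3, 5]                        # hypothetical sample coefficients
print(expression_17(c) == scale(-c[-1], identity(len(c) - 2)))
print(char_poly_at(c) == scale(0, identity(len(c) - 1)))
```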



Some observations about (17): for $0 \le j \le i-2 \le k-2$, the first column of 
$M^j$ is just $e_{j+1}$. And $SR$ is a matrix of zeros, with $-c_k$ in the upper-right 
corner. Thus $M^jSR$ is a matrix of zeros except for the last column, which is 
$-c_ke_{j+1}$. Thus, $M^jSRM^{(i-2)-j}$ is a matrix with zeros everywhere, except in 
row $(j+1)$ where it has the bottom row of $M^{(i-2)-j}$ multiplied by $-c_k$. Let 
$m^{(i-2)-j}$ denote the $1 \times (k-1)$ row vector consisting of the bottom row 
of $M^{(i-2)-j}$. Therefore, (17) is equal to: 

$$-c_k \cdot \begin{pmatrix} \sum_{i=2}^{k} c_{k-i}\, m^{(i-2)} \\ \sum_{i=3}^{k} c_{k-i}\, m^{(i-3)} \\ \vdots \\ \sum_{i=k}^{k} c_{k-i}\, m^{(i-k)} \end{pmatrix} \tag{18}$$



We want to show that (18) is equal to $-c_kI$ to finish the proof of $p_A(A) = 0$. 
To accomplish this, let $l$ denote the $l$-th row of the matrix in (18) starting 
with the bottom row. We want to show, by induction on $l$, that the $l$-th row is 
equal to $e_{k-l}$. 

The Basis Case is $l = 0$: 

$$\sum_{i=k}^{k} c_{k-i}\, m^{(i-k)} = c_0 m^0 = e_k,$$

and we are done. 

For the induction step, note that $m^{i+1}$ is equal to $m^i$ shifted to the left by 
one position, and with 

$$m^i \cdot ( -c_{k-1} \;\; -c_{k-2} \;\; \cdots \;\; -c_1 )^t \tag{19}$$

in the last position. We introduce some more notation: let $r_l$ denote the 
$(k-l)$-th row of (18). Thus $r_l$ is a $1 \times (k-1)$ row vector. Let 
$\overleftarrow{r_l}$ denote $r_l$ shifted by one position to the left, and with a 
zero in the last position. This can be stated succinctly in LAP as follows: 

$$\overleftarrow{r_l} := \lambda ij\langle 1, (k-1), e(r_l, 1, j+1)\rangle.$$

Based on (18) and (19) we can see that: 

$$r_{l+1} = \overleftarrow{r_l} + [r_l \cdot ( -c_{k-1} \;\; -c_{k-2} \;\; \cdots \;\; -c_1 )^t]\,e_k + c_l m^0.$$

(Here the "$\cdot$" in the square brackets denotes the dot product of the two 
vectors.) Using the induction hypothesis: $\overleftarrow{r_l} = e_{k-(l+1)}$, and 

$$r_l \cdot ( -c_{k-1} \;\; -c_{k-2} \;\; \cdots \;\; -c_1 )^t = e_{k-l} \cdot ( -c_{k-1} \;\; -c_{k-2} \;\; \cdots \;\; -c_1 )^t = -c_l$$

so $r_{l+1} = e_{k-(l+1)} - c_le_k + c_le_k = e_{k-(l+1)}$ as desired. This finishes the 
proof of the fact that the matrix in (18) is the identity matrix, which in turn 
proves that (17) is equal to $-c_kI$, and this ends the proof of $p_A(A) = 0$, which 
finally finishes the main induction argument, and proves the lemma. 



